Add LoRA handling for image generation #4084
mask field should only be accepted in image edit (inpainting) requests, not in text-to-image generation requests.
…s and demo review update
…n, fix docs and includes
- Add --source_loras CLI parameter for specifying LoRA adapters
  Format: alias=org/repo@file.safetensors (comma-separated, @file optional)
- Add LoRA adapter entries to image_gen_calculator.proto
- Parse and validate LoRA settings in image_generation_graph_cli_parser
- Export LoRA adapter entries in graph.pbtxt generation
- Load LoRA .safetensors via ov::genai::Adapter in pipelines.cpp
- Apply LoRA adapters at inference time based on model name routing
- Download LoRA repos via curl (resolve safetensors filename from HF API)
- Add LoRA alias routing in mediapipe factory
- Pass modelName through HttpPayload for LoRA alias matching
- Add 18 unit tests (CLI parsing, graph export, proto parsing, config)
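The `alias=org/repo@file.safetensors` format described in this commit could be parsed roughly as follows. This is a minimal sketch, not the PR's parser: `LoraSource` and `parseSourceLoras` are hypothetical names, and it only covers the HF-repo form from this commit (no URLs, local files, or composites).

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical struct for illustration; the PR's actual settings type may differ.
struct LoraSource {
    std::string alias;
    std::string repo;      // e.g. "org/repo"
    std::string filename;  // empty when "@file" is omitted
};

// Parses "alias=org/repo@file.safetensors,alias2=org2/repo2".
// Returns std::nullopt if any entry lacks "alias=" or has an empty part.
std::optional<std::vector<LoraSource>> parseSourceLoras(const std::string& value) {
    std::vector<LoraSource> result;
    std::size_t start = 0;
    while (start <= value.size()) {
        std::size_t comma = value.find(',', start);
        if (comma == std::string::npos) comma = value.size();
        const std::string entry = value.substr(start, comma - start);
        start = comma + 1;

        const std::size_t eq = entry.find('=');
        if (eq == std::string::npos || eq == 0 || eq + 1 >= entry.size())
            return std::nullopt;
        LoraSource src;
        src.alias = entry.substr(0, eq);
        const std::string rest = entry.substr(eq + 1);
        const std::size_t at = rest.find('@');
        if (at == std::string::npos) {
            src.repo = rest;  // "@file" omitted
        } else {
            src.repo = rest.substr(0, at);
            src.filename = rest.substr(at + 1);
            if (src.filename.empty()) return std::nullopt;
        }
        if (src.repo.empty()) return std::nullopt;
        result.push_back(std::move(src));
    }
    return result;
}
```

Rejecting empty entries and trailing commas is a choice made for this sketch; the commit message does not say how the real parser treats them.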
- Support multiple LoRA source types: HF repo, direct URL, local file (alias= required)
- Extract shared curl_downloader utility from gguf_downloader
- Add composite LoRA aliases (e.g. blend=@pokemon:0.7+@anime:0.5)
- Support per-request lora_weights override in extra_body
- Local files referenced by absolute path in graph.pbtxt (no copy)
- HF LoRA: resolve .safetensors via API, download with curl
- clone() delegates to pullLoraAdapters() for all LoRA downloads
- resolveHfLoraFilenames() + pullLoraAdapters() split (private -> protected)
- Remove loraQueue: T2I/I2I always use clone(), only inpainting serialized
- PipelineSlotGuard (renamed from InpaintingQueueGuard)
- compileProperties built once in constructor (no default arg on reshapeAndCompile)
- CompositeLoraMap type alias replaces duplicate runtime structs
- Multiline composite formatting in graph.pbtxt
- Add RUN_UNSTABLE-gated pull tests for LoRA (HF resolve, download, full-flow)
- Add non-network unit tests (local file skip, non-imagegen no-op)
- Add SetUpServerForDownloadWithLoras test helper
- 59 tests pass (55 original + 4 new, 3 network-gated skip without RUN_UNSTABLE)
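The composite alias syntax from this commit (`blend=@pokemon:0.7+@anime:0.5`) can be illustrated with a small parser for the part after `=`. This is a sketch under assumptions: `CompositePart` and `parseCompositeSpec` are hypothetical names, and the default weight of 1.0 when `:weight` is omitted is assumed, not taken from the PR.

```cpp
#include <exception>
#include <optional>
#include <string>
#include <vector>

// Hypothetical type for illustration; not the PR's CompositeLoraMap.
struct CompositePart {
    std::string alias;
    double weight;
};

// Parses the right-hand side of a composite entry, e.g. "@pokemon:0.7+@anime:0.5".
// Returns std::nullopt on malformed input.
std::optional<std::vector<CompositePart>> parseCompositeSpec(const std::string& spec) {
    std::vector<CompositePart> parts;
    std::size_t start = 0;
    while (start <= spec.size()) {
        std::size_t plus = spec.find('+', start);
        if (plus == std::string::npos) plus = spec.size();
        const std::string item = spec.substr(start, plus - start);
        start = plus + 1;

        // Each component must reference an adapter alias as "@name".
        if (item.size() < 2 || item[0] != '@') return std::nullopt;
        CompositePart part{"", 1.0};  // assumed default weight
        const std::size_t colon = item.find(':');
        if (colon == std::string::npos) {
            part.alias = item.substr(1);
        } else {
            part.alias = item.substr(1, colon - 1);
            const std::string number = item.substr(colon + 1);
            try {
                std::size_t used = 0;
                part.weight = std::stod(number, &used);
                if (used != number.size()) return std::nullopt;  // trailing junk
            } catch (const std::exception&) {
                return std::nullopt;  // not a number
            }
        }
        if (part.alias.empty()) return std::nullopt;
        parts.push_back(part);
    }
    return parts;
}
```

Splitting on `+` keeps the sketch short, but it means weights written with an exponent sign (such as `1e+2`) would not parse here.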
Resolved conflicts in:
- pipelines.hpp: keep PipelineSlotGuard name and LoRA fields, adopt main's blocking comment
- pipelines.cpp: keep LoRA adapter loading, compileProperties, adopt main's SPDLOG_ERROR
- http_image_gen_calculator.cc: keep LoRA logic, adopt main's const ref for inpainting tensors
- README.md: accept main's updated examples (model names, sizes, notes)
- Fix downloadFileWithCurl: use overload instead of const ref default parameter (was binding temporary to const std::string&)
- Add HF_TOKEN auth header to curl downloads for HF repos only (avoid leaking credentials to arbitrary DIRECT_URL servers)
- Rename authToken -> authTokenHF for clarity
- Skip RUN_UNSTABLE tests when HF_TOKEN is not set
- Provide explicit safetensors filename in download tests
- Restore missing 'curl -O' PNG download commands in image_generation README
- Update copilot-instructions rule 13: expanded dangling reference guidance
- Add missing #include <vector>, <utility> (cpplint)
- Fix comment spacing (cpplint)
- clang-format all changed files
MSVC /W4 treats variable shadowing as error (C4456). Inner loop variable 'it' shadowed outer pipelinesMap iterator.
- Detect Windows absolute paths (e.g. C:\path\to\file.safetensors)
in addition to Unix paths (/ and ./ prefixes)
- Also detect .\ prefix for relative Windows paths
- Use find_last_of("/\\") instead of rfind('/') to extract
filename from both Unix and Windows paths
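The path handling described in this commit can be sketched as two small helpers. The names `looksLikeLocalFilePath` and `fileNameOf` are hypothetical (the PR's function is `isLocalFilePath` in `stringutils`, whose exact rules are not shown here), and accepting `C:/` alongside `C:\` is an assumption of this sketch.

```cpp
#include <cctype>
#include <string>

// Sketch of local-path detection: Unix absolute ("/"), relative ("./" or ".\"),
// and Windows absolute ("C:\"; "C:/" is accepted here as an assumption).
bool looksLikeLocalFilePath(const std::string& s) {
    if (s.empty()) return false;
    if (s[0] == '/') return true;
    if (s.rfind("./", 0) == 0 || s.rfind(".\\", 0) == 0) return true;
    return s.size() >= 3 && std::isalpha(static_cast<unsigned char>(s[0])) &&
           s[1] == ':' && (s[2] == '\\' || s[2] == '/');
}

// Extracts the filename using find_last_of so both separators are handled.
std::string fileNameOf(const std::string& path) {
    const std::size_t pos = path.find_last_of("/\\");
    return pos == std::string::npos ? path : path.substr(pos + 1);
}
```

With `rfind('/')` alone, a path such as `C:\path\to\file.safetensors` has no match and the whole string would be taken as the filename, which is why the commit switches to `find_last_of("/\\")`.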
…ting_lora
# Conflicts:
#   src/mediapipe_internal/mediapipegraphdefinition.cpp
#   src/mediapipe_internal/mediapipegraphdefinition.hpp
#   src/pull_module/BUILD
#   src/server.cpp
#   src/test/graph_export_test.cpp
…ll_hf_models
- Fix demos/image_generation/README.md: use adapter alias as model name instead of base model + lora_weights for LoRA selection
- Fix guidance_scale: 0 -> 0.0 (OVMS rejects integer values)
- Fix docs/image_generation/reference.md: clarify model name routing as the adapter selection mechanism, document blending via composite adapters
- Fix docs/model_server_rest_api_image_generation.md: clarify lora_weights only overrides weights of already-active adapters
- Add docs/pull_hf_models.md: section on pulling image gen models with LoRA
- Add static isValidLoraAlias() in CLI parser to sanitize LoRA alias names (alphanumeric, hyphens, underscores, dots only)
- Add ServableNameChecker collision detection in mediapipefactory when registering LoRA aliases (reject if alias shadows model/pipeline/graph name)
- Revert file_system_poll_wait_seconds default to 1 and sequence_cleaner_poll_wait_minutes default to 5
- Fix missing HfDownloaderPullHfModel test fixture after merge
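The alias rule stated above (alphanumeric, hyphens, underscores, dots only) amounts to a per-character check. A minimal sketch follows; `isValidAliasSketch` is a hypothetical name, and rejecting the empty string is an assumption, since the commit message does not mention it.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Accepts only [A-Za-z0-9._-]; empty aliases are rejected (assumption).
bool isValidAliasSketch(const std::string& alias) {
    if (alias.empty()) return false;
    return std::all_of(alias.begin(), alias.end(), [](unsigned char c) {
        return std::isalnum(c) || c == '-' || c == '_' || c == '.';
    });
}
```

Excluding `/`, `:`, `@`, `+` and `=` keeps aliases from colliding with the separators used in the `--source_loras` syntax.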
- NPU detected: set AdapterConfig::MODE_STATIC, skip runtime adapter switching
- Reject composite LoRA adapters on NPU (runtime switching unavailable)
- Warn when multiple LoRAs configured on NPU (all compiled permanently)
- Rename npuLoraFused -> npuLoraStaticMode for accuracy
- Add CLI LoRA parsing tests: alias validation, source types, composites
- Add pbtxt composite LoRA test in text2image_test
- Add local file path tests (Unix absolute, Windows behind ifdef)
… weight->alpha
- Add aliasesConflict() to ServableNameChecker interface for LoRA alias collision detection during graph validation (before factory lock)
- Implement aliasesConflictExcluding() in MediapipeFactory with shared_lock
- Validate aliases in mediapipegraphdefinition validate() after initializeNodes
- Simplify createDefinition alias loop (checks moved to validate phase)
- Update reloadDefinition to clear+re-register aliases on reload
- NPU LoRA calculator rejection: reject requests to main graph name when npuLoraStaticMode is active (direct client to use alias)
- Multi-LoRA NPU: require composite_lora_adapters definition (hard error)
- Multi-LoRA NPU calculator: only composite aliases accepted as targets
- Rename CompositeLoraComponent.weight -> alpha across proto/struct/CLI/export
- Rename npuLoraFused -> npuLoraStaticMode
- Register composite aliases for routing in image_gen_node_initializer
- Fix fmt formatting of resolution_t in imagegen_init.cpp log statements
…ting_lora
# Conflicts:
#   docs/pull_hf_models.md
Add LoraLoadMode enum to proto and C++ to support different LoRA loading strategies:
- DYNAMIC (default): Runtime-switchable adapters
- STATIC: Static rank compilation
- FUSE: Permanently merge LoRA into base weights at compile time

FUSE adapters are compiled separately and excluded from runtime alias registration. DYNAMIC/STATIC adapters remain switchable at generate time.

Also fixes composite LoRA alias registration (skip FUSE adapters) and adds tests for the new functionality.
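The rule that FUSE adapters are excluded from runtime alias registration can be illustrated as a simple filter. This is a sketch only: `LoadModeSketch`, `AdapterEntry` and `routableAliases` are hypothetical names that mirror the three modes listed in the commit, not the PR's actual enum or registration code.

```cpp
#include <string>
#include <vector>

// Mirrors the three modes named in the commit message (hypothetical enum).
enum class LoadModeSketch { DYNAMIC, STATIC, FUSE };

struct AdapterEntry {
    std::string alias;
    LoadModeSketch mode = LoadModeSketch::DYNAMIC;  // DYNAMIC is the stated default
};

// Returns the aliases that stay selectable at generate time:
// FUSE adapters are merged into the base weights and therefore skipped.
std::vector<std::string> routableAliases(const std::vector<AdapterEntry>& adapters) {
    std::vector<std::string> out;
    for (const auto& adapter : adapters) {
        if (adapter.mode != LoadModeSketch::FUSE) {
            out.push_back(adapter.alias);
        }
    }
    return out;
}
```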
- Rename expectedImageGenNpuFuse → expectedImageGenNpuStatic in tests
- Fix NPU error message: 'fused' → 'static' in imagegen_init.cpp
- Fix CLI parser comment: clarify STATIC mode for NPU adapters
- All 44 LoRA tests passing
Pull request overview
This PR extends OVMS image generation to support LoRA adapters end-to-end: CLI parsing (--source_loras) and graph export, LoRA download during HF pull, pipeline compilation with adapters, per-request adapter selection via model name routing (including composites), and alias-based routing/visibility in the MediaPipe factory.
Changes:
- Add LoRA adapter definitions (single + composite) to ImageGen graph proto, parsing, and graph export/CLI plumbing.
- Implement HF pull support for LoRA adapters (HF repo resolution + download; direct URL/local path support via CLI parsing).
- Add runtime routing support for LoRA aliases (MediaPipe alias registration/hide-base-model behavior) and request-level `lora_weights` overrides.
Reviewed changes
Copilot reviewed 44 out of 44 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/test/text2image_test.cpp | Adds pbtxt parsing tests for LoRA adapter fields. |
| src/test/test_utils.hpp | Declares new server test helpers for pull/start with LoRAs and REST port. |
| src/test/test_utils.cpp | Implements new server test helpers (threaded start with LoRA args). |
| src/test/pull_hf_model_test.cpp | Adds HF pull + LoRA tests and a large unstable pull/serve/generate integration test. |
| src/test/ovmsconfig_test.cpp | Adds config parsing tests for invalid/valid --source_loras combinations. |
| src/test/graph_export_test.cpp | Adds extensive graph export + CLI-to-settings tests for LoRA/composites/source types. |
| src/stringutils.hpp | Declares isLocalFilePath. |
| src/stringutils.cpp | Implements isLocalFilePath (Unix + Windows absolute + ./ .\). |
| src/server.cpp | Adjusts HF pull module casting to call non-const clone(). |
| src/servable_name_checker.hpp | Extends checker interface with alias-conflict detection. |
| src/pull_module/hf_pull_model_module.hpp | Makes clone() non-const; exposes LoRA resolve/pull helpers (protected). |
| src/pull_module/hf_pull_model_module.cpp | Adds HF API resolution for LoRA safetensors + downloads during clone(). |
| src/pull_module/gguf_downloader.cpp | Refactors curl download logic to shared curl downloader helper. |
| src/pull_module/curl_downloader.hpp | New shared curl download helper API. |
| src/pull_module/curl_downloader.cpp | New curl downloader implementation (progress + optional auth header). |
| src/pull_module/BUILD | Adds curl_downloader target; wires into pull module deps. |
| src/modelmanager.hpp | Implements new aliasesConflict API. |
| src/modelmanager.cpp | Adds alias conflict checks across models/pipelines/mediapipe definitions. |
| src/mediapipe_internal/mediapipegraphdefinition.hpp | Stores discovered LoRA aliases + hide-base-model flag. |
| src/mediapipe_internal/mediapipegraphdefinition.cpp | Validates LoRA alias conflicts; propagates LoRA routing metadata from node init. |
| src/mediapipe_internal/mediapipefactory.hpp | Adds alias→graph mapping and helper methods. |
| src/mediapipe_internal/mediapipefactory.cpp | Registers LoRA aliases for lookup/listing; hides base model when requested. |
| src/mediapipe_internal/graph_side_packets.hpp | Extends side packets with LoRA aliases and hide-base-model flag. |
| src/image_gen/pipelines.hpp | Adds adapter/composite storage; renames queue guard to PipelineSlotGuard. |
| src/image_gen/pipelines.cpp | Loads adapters and compiles pipelines with adapter properties; tracks NPU/static mode. |
| src/image_gen/imagegenutils.cpp | Allows lora_weights in accepted request fields. |
| src/image_gen/imagegenpipelineargs.hpp | Adds LoRA adapter + composite settings to pipeline args. |
| src/image_gen/imagegen_init.cpp | Parses LoRA adapter and composite entries from ImageGenCalculatorOptions. |
| src/image_gen/image_gen_node_initializer.cpp | Registers LoRA aliases into graph side packets; sets hide-base-model. |
| src/image_gen/image_gen_calculator.proto | Adds LoRA adapter and composite adapter proto fields + load mode enum. |
| src/image_gen/http_image_gen_calculator.cc | Applies LoRA selection per request (model routing) + optional lora_weights. |
| src/http_rest_api_handler.cpp | Persists resolved model name into HttpPayload. |
| src/http_payload.hpp | Adds modelName to payload for downstream routing logic. |
| src/graph_export/image_generation_graph_cli_parser.cpp | Adds --source_loras parsing (repo/url/local + composites + alpha). |
| src/graph_export/graph_export.cpp | Emits LoRA adapter entries (and composite entries) into generated graph.pbtxt. |
| src/cli_parser.cpp | Adds --source_loras CLI option and stores it in HF settings. |
| src/capi_frontend/server_settings.hpp | Adds LoRA settings types + HFSettingsImpl::sourceLoras. |
| src/BUILD | Adds cpp-httplib + image_generation_graph_cli_parser deps to tests. |
| docs/pull_hf_models.md | Documents --source_loras for pull mode. |
| docs/model_server_rest_api_image_generation.md | Documents lora_weights request field. |
| docs/image_generation/reference.md | Adds LoRA adapter usage docs (routing, composites, overrides). |
| demos/image_generation/README.md | Adds Multi-LoRA serving examples; improves inpainting/outpainting notes. |
| demos/common/export_models/export_model.py | Adds --source_loras support for exporting image generation configs (with LoRA download). |
| .github/copilot-instructions.md | Updates guidance about avoiding dangling refs in default args. |
Comments suppressed due to low confidence (1)
src/test/test_utils.cpp:850
- This overload builds `argv` using `port.c_str()` where `port` is a local variable, and `argv` itself is a stack array. The server thread may outlive this function, so the argument pointers can dangle (use-after-scope). Please ensure argument storage outlives the thread (heap-owned vectors captured by value).
| // All adapters were registered at compile time (alpha=1.0 each). | ||
| // At generate time we must explicitly set the adapter config: | ||
| // - If modelName matches a composite alias: activate all component adapters with their weights. | ||
| // - If modelName matches a single adapter alias: activate that adapter. | ||
| // - Otherwise: disable all adapters (alpha=0) so the base model runs clean. | ||
| // lora_weights from request body can override default weights. |
It is already updated.
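The selection rules in the quoted comment can be sketched without the OpenVINO GenAI types as a function that computes an alpha per adapter. All names here (`AlphaMap`, `selectAdapterAlphas`) are hypothetical, the alpha of 1.0 for a single alias is assumed, and the override behavior follows the docs fix noted earlier (`lora_weights` only overrides weights of already-active adapters).

```cpp
#include <map>
#include <string>
#include <vector>

using AlphaMap = std::map<std::string, double>;

// Computes the per-adapter alpha for one request (illustrative sketch).
AlphaMap selectAdapterAlphas(const std::string& modelName,
                             const std::vector<std::string>& adapters,
                             const std::map<std::string, AlphaMap>& composites,
                             const AlphaMap& overrides) {
    // Start with every adapter disabled so the base model runs clean.
    AlphaMap active;
    for (const auto& name : adapters) active[name] = 0.0;

    const auto composite = composites.find(modelName);
    if (composite != composites.end()) {
        // Composite alias: activate each component with its configured weight.
        for (const auto& component : composite->second) {
            const auto it = active.find(component.first);
            if (it != active.end()) it->second = component.second;
        }
    } else if (active.count(modelName) != 0) {
        // Single adapter alias: activate just that adapter.
        active[modelName] = 1.0;
    }

    // Request overrides only adjust adapters that are already active.
    for (const auto& entry : overrides) {
        const auto it = active.find(entry.first);
        if (it != active.end() && it->second != 0.0) it->second = entry.second;
    }
    return active;
}
```

In the real calculator the resulting values would presumably be applied through the adapter config passed at generate time; that step is left out here.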
- Add validateLoraAdapterConfig() for alpha consistency between individual and composite levels (error if both non-default)
- NPU validation: composites required for multi-LoRA, all adapters must be referenced, consistent alpha across composites
- Fix Windows drive letter colon in CLI parser (lastColon > 1)
- Document LoRA adapter modes (DYNAMIC/STATIC/FUSE) in reference.md
- Document --source_loras format, alpha, source type detection
- Add tests: alpha at individual/composite/both levels, explicit 1.0, Windows absolute path with alpha
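The drive-letter fix mentioned above (`lastColon > 1`) can be illustrated with a helper that splits a trailing `:alpha` from a source string. This is a sketch: `splitAlphaSuffix` is a hypothetical name, and the extra check that the suffix contains no path separator is an assumption added here so that URLs with a scheme or port are left intact.

```cpp
#include <string>
#include <utility>

// Splits "source:alpha" at the last colon. A colon at index 1 is treated as a
// Windows drive letter ("C:\...") and never as the alpha separator.
// Returns {source, alpha}; alpha is empty when no suffix is present.
std::pair<std::string, std::string> splitAlphaSuffix(const std::string& s) {
    const std::size_t lastColon = s.rfind(':');
    if (lastColon != std::string::npos && lastColon > 1) {
        const std::string suffix = s.substr(lastColon + 1);
        // Assumption: an alpha never contains a path separator.
        if (!suffix.empty() && suffix.find_first_of("/\\") == std::string::npos) {
            return {s.substr(0, lastColon), suffix};
        }
    }
    return {s, std::string()};
}
```

Without the `> 1` guard, `C:\loras\a.safetensors` would be split into `C` and `\loras\a.safetensors`.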
| // See the License for the specific language governing permissions and | ||
| // limitations under the License. | ||
| //***************************************************************************** | ||
| #include "curl_downloader.hpp" |
Extracted from GGUF_downloader
Those are almost the same. Can we make a base curlDownloader class?
The common functionality is already extracted into the downloadFileWithCurl() free function in curl_downloader.cpp, which both GGUF and LoRA paths call. The orchestration around it differs: GGUF handles multi-part file resolution and overwrite-remove logic; LoRA handles source-type dispatch (HF repo / URL / local),
| **FUSE mode:** | ||
| - The adapter is merged into base weights during model compilation using `MODE_FUSE`. | ||
| - It is always active — the base model without the adapter is **not accessible**. | ||
| - Does not appear in the list of routable adapters and cannot be selected or deselected via the `model` field. |
Only the adapter is available.
| SttServableMap sttServableMap; | ||
| TtsServableMap ttsServableMap; | ||
| std::vector<std::string> loraAliases; | ||
| bool hideBaseModel = false; |
Please describe what it is used for.
| ASSERT_EQ(std::get<Status>(res), ovms::StatusCode::PLUGIN_CONFIG_CONFLICTING_PARAMETERS); | ||
| } | ||
|
|
||
| // ===================== LoRA Graph Export Tests ===================== |
I suggest adding a new file for those LoRA-specific tests. I do not see that we reuse much from this file.
removeVersionString().
I see it is already defined in two places (both the graph_export and hf_pull tests), so I will extract it and share it across all three files.
| uint16_t n = 3; | ||
| testResponseFromOvTensor(n); | ||
| } | ||
| // ===================== LoRA Proto Parsing Tests ===================== |
I suggest adding a new file for these tests.
In this case I am not convinced: LoRAs are mainly for image generation, and here we test essentially the same thing (proto parsing).
| std::vector<std::string> loraAliases_; | ||
| bool hideBaseModel_ = false; |
This includes:
-> LoRA pulling
-> multiple LoRA handling
-> NPU LoRA handling